{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\nCompiling and Optimizing a Model with TVMC\n==========================================\n**Authors**:\n`Leandro Nunes <https://github.com/leandron>`_,\n`Matthew Barrett <https://github.com/mbaret>`_,\n`Chris Hoge <https://github.com/hogepodge>`_\n\nIn this section, we will work with TVMC, the TVM command line driver. TVMC is a\ntool that exposes TVM features such as auto-tuning, compiling, profiling and\nexecution of models through a command line interface.\n\nUpon completion of this section, we will have used TVMC to accomplish the\nfollowing tasks:\n\n* Compile a pre-trained ResNet 50 v2 model for the TVM runtime.\n* Run a real image through the compiled model, and interpret the output and\n  model performance.\n* Tune the model on a CPU using TVM.\n* Re-compile an optimized model using the tuning data collected by TVM.\n* Run the image through the optimized model, and compare the output and model\n  performance.\n\nThe goal of this section is to give you an overview of TVM and TVMC's\ncapabilities, and set the stage for understanding how TVM works.\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Using TVMC\n----------\n\nTVMC is a Python application, part of the TVM Python package.\nWhen you install TVM using a Python package, you will get TVMC as\nas a command line application called ``tvmc``. The location of this command\nwill vary depending on your platform and installation method.\n\nAlternatively, if you have TVM as a Python module on your\n``$PYTHONPATH``,you can access the command line driver functionality\nvia the executable python module, ``python -m tvm.driver.tvmc``.\n\nFor simplicity, this tutorial will mention TVMC command line using\n``tvmc <options>``, but the same results can be obtained with\n``python -m tvm.driver.tvmc <options>``.\n\nYou can check the help page using:\n\n.. code-block:: bash\n\n  tvmc --help\n\nThe main features of TVM available to ``tvmc`` are from subcommands\n``compile``, and ``run``, and ``tune``.  To read about specific options under\na given subcommand, use ``tvmc <subcommand> --help``. We will cover each of\nthese commands in this tutorial, but first we need to download a pre-trained\nmodel to work with.\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Obtaining the Model\n-------------------\n\nFor this tutorial, we will be working with ResNet-50 v2. ResNet-50 is a\nconvolutional neural network that is 50-layers deep and designed to classify\nimages. The model we will be using has been pre-trained on more than a\nmillion images with 1000 different classifications. The network has an input\nimage size of 224x224. If you are interested exploring more of how the\nResNet-50 model is structured, we recommend downloading `Netron\n<https://netron.app>`, a freely available ML model viewer.\n\nFor this tutorial we will be using the model in ONNX format.\n\n.. code-block:: bash\n\n  wget https://github.com/onnx/models/raw/master/vision/classification/resnet/model/resnet50-v2-7.onnx\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "<div class=\"alert alert-info\"><h4>Note</h4><p>Supported model formats\n\n  TVMC supports models created with Keras, ONNX, TensorFlow, TFLite\n  and Torch. Use the option``--model-format`` if you need to\n  explicitly provide the model format you are using. See ``tvmc\n  compile --help`` for more information.</p></div>\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "<div class=\"alert alert-info\"><h4>Note</h4><p>Adding ONNX Support to TVM\n\n   TVM relies on the ONNX python library being available on your system. You can\n   install ONNX using the command ``pip3 install --user onnx onnxoptimizer``. You\n   may remove the ``--user`` option if you have root access and want to install\n   ONNX globally.  The ``onnxoptimizer`` dependency is optional, and is only used\n   for ``onnx>=1.9``.</p></div>\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Compiling an ONNX Model to the TVM Runtime\n------------------------------------------\n\nOnce we've downloaded the ResNet-50 model, the next step is to compile it. To\naccomplish that, we are going to use ``tvmc compile``. The output we get from\nthe compilation process is a TAR package of the model compiled to a dynamic\nlibrary for our target platform. We can run that model on our target device\nusing the TVM runtime.\n\n.. code-block:: bash\n\n  tvmc compile \\\n  --target \"llvm\" \\\n  --output resnet50-v2-7-tvm.tar \\\n  resnet50-v2-7.onnx\n\nLet's take a look at the files that ``tvmc compile`` creates in the module:\n\n.. code-block:: bash\n\n\tmkdir model\n\ttar -xvf resnet50-v2-7-tvm.tar -C model\n\tls model\n\nYou will see three files listed.\n\n* ``mod.so`` is the model, represented as a C++ library, that can be loaded\n  by the TVM runtime.\n* ``mod.json`` is a text representation of the TVM Relay computation graph.\n* ``mod.params`` is a file containing the parameters for the pre-trained\n  model.\n\nThis module can be directly loaded by your application, and the model can be\nrun via the TVM runtime APIs.\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "<div class=\"alert alert-info\"><h4>Note</h4><p>Defining the Correct Target\n\n  Specifying the correct target (option ``--target``) can have a huge\n  impact on the performance of the compiled module, as it can take\n  advantage of hardware features available on the target. For more\n  information, please refer to `Auto-tuning a convolutional network\n  for x86 CPU <https://tvm.apache.org/docs/tutorials/autotvm/tune_relay_x86.html#define-network>`_.\n  We recommend identifying which CPU you are running, along with optional features,\n  and set the target appropriately.</p></div>\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Running the Model from The Compiled Module with TVMC\n----------------------------------------------------\n\nNow that we've compiled the model to this module, we can use the TVM runtime\nto make predictions with it. TVMC has the TVM runtime built in to it,\nallowing you to run compiled TVM models. To use TVMC to run the model and\nmake predictions, we need two things:\n\n- The compiled module, which we just produced.\n- Valid input to the model to make predictions on.\n\nEach model is particular when it comes to expected tensor shapes, formats and\ndata types. For this reason, most models require some pre and\npost-processing, to ensure the input is valid and to interpret the output.\nTVMC has adopted NumPy's ``.npz`` format for both input and output data. This\nis a well-supported NumPy format to serialize multiple arrays into a file\n\nAs input for this tutorial, we will use the image of a cat, but you can feel\nfree to substitute image for any of your choosing.\n\n![](https://s3.amazonaws.com/model-server/inputs/kitten.jpg)\n\n   :height: 224px\n   :width: 224px\n   :align: center\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Input pre-processing\n~~~~~~~~~~~~~~~~~~~~\n\nFor our ResNet 50 V2 model, the input is expected to be in ImageNet format.\nHere is an example of a script to pre-process an image for ResNet 50 V2.\n\nYou will need to have a supported version of the Python Image Library\ninstalled. You can use ``pip3 install --user pillow`` to satisfy this\nrequirement for the script.\n\n.. code-block:: python\n   :caption: preprocess.py\n   :name: preprocess.py\n\n    #!python ./preprocess.py\n    from tvm.contrib.download import download_testdata\n    from PIL import Image\n    import numpy as np\n\n    img_url = \"https://s3.amazonaws.com/model-server/inputs/kitten.jpg\"\n    img_path = download_testdata(img_url, \"imagenet_cat.png\", module=\"data\")\n\n    # Resize it to 224x224\n    resized_image = Image.open(img_path).resize((224, 224))\n    img_data = np.asarray(resized_image).astype(\"float32\")\n\n    # ONNX expects NCHW input, so convert the array\n    img_data = np.transpose(img_data, (2, 0, 1))\n\n    # Normalize according to ImageNet\n    imagenet_mean = np.array([0.485, 0.456, 0.406])\n    imagenet_stddev = np.array([0.229, 0.224, 0.225])\n    norm_img_data = np.zeros(img_data.shape).astype(\"float32\")\n    for i in range(img_data.shape[0]):\n   \t    norm_img_data[i, :, :] = (img_data[i, :, :] / 255 - imagenet_mean[i]) / imagenet_stddev[i]\n\n    # Add batch dimension\n    img_data = np.expand_dims(norm_img_data, axis=0)\n\n    # Save to .npz (outputs imagenet_cat.npz)\n    np.savez(\"imagenet_cat\", data=img_data)\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Running the Compiled Module\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWith both the model and input data in hand, we can now run TVMC to make a\nprediction:\n\n.. code-block:: bash\n\n    tvmc run \\\n    --inputs imagenet_cat.npz \\\n    --output predictions.npz \\\n    resnet50-v2-7-tvm.tar\n\nRecall that the `.tar` model file includes a C++ library, a description of\nthe Relay model, and the parameters for the model. TVMC includes the TVM\nruntime, which can load the model and make predictions against input. When\nrunning the above command, TVMC outputs a new file, ``predictions.npz``, that\ncontains the model output tensors in NumPy format.\n\nIn this example, we are running the model on the same machine that we used\nfor compilation. In some cases we might want to run it remotely via an RPC\nTracker. To read more about these options please check ``tvmc run --help``.\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Output Post-Processing\n~~~~~~~~~~~~~~~~~~~~~~\n\nAs previously mentioned, each model will have its own particular way of\nproviding output tensors.\n\nIn our case, we need to run some post-processing to render the outputs from\nResNet 50 V2 into a more human-readable form, using the lookup-table provided\nfor the model.\n\nThe script below shows an example of the post-processing to extract labels\nfrom the output of our compiled module.\n\n.. code-block:: python\n    :caption: postprocess.py\n    :name: postprocess.py\n\n    #!python ./postprocess.py\n    import os.path\n    import numpy as np\n\n    from scipy.special import softmax\n\n    from tvm.contrib.download import download_testdata\n\n    # Download a list of labels\n    labels_url = \"https://s3.amazonaws.com/onnx-model-zoo/synset.txt\"\n    labels_path = download_testdata(labels_url, \"synset.txt\", module=\"data\")\n\n    with open(labels_path, \"r\") as f:\n        labels = [l.rstrip() for l in f]\n\n    output_file = \"predictions.npz\"\n\n    # Open the output and read the output tensor\n    if os.path.exists(output_file):\n        with np.load(output_file) as data:\n            scores = softmax(data[\"output_0\"])\n            scores = np.squeeze(scores)\n            ranks = np.argsort(scores)[::-1]\n\n            for rank in ranks[0:5]:\n                print(\"class='%s' with probability=%f\" % (labels[rank], scores[rank]))\n\nRunning this script should produce the following output:\n\n.. code-block:: bash\n\n    python postprocess.py\n\n    # class='n02123045 tabby, tabby cat' with probability=0.610553\n    # class='n02123159 tiger cat' with probability=0.367179\n    # class='n02124075 Egyptian cat' with probability=0.019365\n    # class='n02129604 tiger, Panthera tigris' with probability=0.001273\n    # class='n04040759 radiator' with probability=0.000261\n\nTry replacing the cat image with other images, and see what sort of\npredictions the ResNet model makes.\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Automatically Tuning the ResNet Model\n-------------------------------------\n\nThe previous model was compiled to work on the TVM runtime, but did not\ninclude any platform specific optimization. In this section, we will show you\nhow to build an optimized model using TVMC to target your working platform.\n\nIn some cases, we might not get the expected performance when running\ninferences using our compiled module.  In cases like this, we can make use of\nthe auto-tuner, to find a better configuration for our model and get a boost\nin performance. Tuning in TVM refers to the process by which a model is\noptimized to run faster on a given target. This differs from training or\nfine-tuning in that it does not affect the accuracy of the model, but only\nthe runtime performance. As part of the tuning process, TVM will try running\nmany different operator implementation variants to see which perform best.\nThe results of these runs are stored in a tuning records file, which is\nultimately the output of the ``tune`` subcommand.\n\nIn the simplest form, tuning requires you to provide three things:\n\n- the target specification of the device you intend to run this model on\n- the path to an output file in which the tuning records will be stored, and\n  finally\n- a path to the model to be tuned.\n\nThe example below demonstrates how that works in practice:\n\n.. code-block:: bash\n\n    tvmc tune \\\n    --target \"llvm\" \\\n    --output resnet50-v2-7-autotuner_records.json \\\n    resnet50-v2-7.onnx\n\nIn this example, you will see better results if you indicate a more specific\ntarget for the `--target` flag.  For example, on an Intel i7 processor you\ncould use `--target llvm -mcpu=skylake`. For this tuning example, we are\ntuning locally on the CPU using LLVM as the compiler for the specified\nachitecture.\n\nTVMC will perform a search against the parameter space for the model, trying\nout different configurations for operators and choosing the one that runs\nfastest on your platform. Although this is a guided search based on the CPU\nand model operations, it can still take several hours to complete the search.\nThe output of this search will be saved to the\n`resnet50-v2-7-autotuner_records.json` file, which will later be used to\ncompile an optimized model.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>Defining the Tuning Search Algorithm\n\n  By default this search is guided using an `XGBoost Grid` algorithm.\n  Depending on your model complexity and amount of time avilable, you might\n  want to choose a different algorithm. A full list is available by\n  consulting ``tvmc tune --help``.</p></div>\n\nThe output will look something like this for a consumer-level Skylake CPU:\n\n.. code-block:: bash\n\n  tvmc tune   --target \"llvm -mcpu=broadwell\"   --output resnet50-v2-7-autotuner_records.json   resnet50-v2-7.onnx\n  # [Task  1/24]  Current/Best:    9.65/  23.16 GFLOPS | Progress: (60/1000) | 130.74 s Done.\n  # [Task  1/24]  Current/Best:    3.56/  23.16 GFLOPS | Progress: (192/1000) | 381.32 s Done.\n  # [Task  2/24]  Current/Best:   13.13/  58.61 GFLOPS | Progress: (960/1000) | 1190.59 s Done.\n  # [Task  3/24]  Current/Best:   31.93/  59.52 GFLOPS | Progress: (800/1000) | 727.85 s Done.\n  # [Task  4/24]  Current/Best:   16.42/  57.80 GFLOPS | Progress: (960/1000) | 559.74 s Done.\n  # [Task  5/24]  Current/Best:   12.42/  57.92 GFLOPS | Progress: (800/1000) | 766.63 s Done.\n  # [Task  6/24]  Current/Best:   20.66/  59.25 GFLOPS | Progress: (1000/1000) | 673.61 s Done.\n  # [Task  7/24]  Current/Best:   15.48/  59.60 GFLOPS | Progress: (1000/1000) | 953.04 s Done.\n  # [Task  8/24]  Current/Best:   31.97/  59.33 GFLOPS | Progress: (972/1000) | 559.57 s Done.\n  # [Task  9/24]  Current/Best:   34.14/  60.09 GFLOPS | Progress: (1000/1000) | 479.32 s Done.\n  # [Task 10/24]  Current/Best:   12.53/  58.97 GFLOPS | Progress: (972/1000) | 642.34 s Done.\n  # [Task 11/24]  Current/Best:   30.94/  58.47 GFLOPS | Progress: (1000/1000) | 648.26 s Done.\n  # [Task 12/24]  Current/Best:   23.66/  58.63 GFLOPS | Progress: (1000/1000) | 851.59 s Done.\n  # [Task 13/24]  Current/Best:   25.44/  59.76 GFLOPS | Progress: (1000/1000) | 534.58 s Done.\n  # [Task 14/24]  Current/Best:   26.83/  58.51 GFLOPS | Progress: (1000/1000) | 491.67 s Done.\n  # [Task 15/24]  Current/Best:   33.64/  58.55 GFLOPS | Progress: (1000/1000) | 529.85 s Done.\n  # [Task 16/24]  Current/Best:   14.93/  57.94 GFLOPS | Progress: (1000/1000) | 645.55 s Done.\n  # [Task 17/24]  Current/Best:   28.70/  58.19 GFLOPS | Progress: (1000/1000) | 756.88 s Done.\n  # [Task 18/24]  Current/Best:   19.01/  60.43 GFLOPS | Progress: (980/1000) | 514.69 s Done.\n  # [Task 19/24]  Current/Best:   14.61/  57.30 GFLOPS | Progress: (1000/1000) | 614.44 s Done.\n  # [Task 20/24]  Current/Best:   10.47/  57.68 GFLOPS | Progress: (980/1000) | 479.80 s Done.\n  # [Task 21/24]  Current/Best:   34.37/  58.28 GFLOPS | Progress: (308/1000) | 225.37 s Done.\n  # [Task 22/24]  Current/Best:   15.75/  57.71 GFLOPS | Progress: (1000/1000) | 1024.05 s Done.\n  # [Task 23/24]  Current/Best:   23.23/  58.92 GFLOPS | Progress: (1000/1000) | 999.34 s Done.\n  # [Task 24/24]  Current/Best:   17.27/  55.25 GFLOPS | Progress: (1000/1000) | 1428.74 s Done.\n\nTuning sessions can take a long time, so ``tvmc tune`` offers many options to customize your tuning\nprocess, in terms of number of repetitions (``--repeat`` and ``--number``, for example), the tuning\nalgorithm to be used, and so on. Check ``tvmc tune --help`` for more information.\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Compiling an Optimized Model with Tuning Data\n----------------------------------------------\n\nAs an output of the tuning process above, we obtained the tuning records\nstored in ``resnet50-v2-7-autotuner_records.json``. This file can be used in\ntwo ways:\n\n- As input to further tuning (via ``tvmc tune --tuning-records``).\n- As input to the compiler\n\nThe compiler will use the results to generate high performance code for the\nmodel on your specified target. To do that we can use ``tvmc compile\n--tuning-records``. Check ``tvmc compile --help`` for more information.\n\nNow that tuning data for the model has been collected, we can re-compile the\nmodel using optimized operators to speed up our computations.\n\n.. code-block:: bash\n\n  tvmc compile \\\n  --target \"llvm\" \\\n  --tuning-records resnet50-v2-7-autotuner_records.json  \\\n  --output resnet50-v2-7-tvm_autotuned.tar \\\n  resnet50-v2-7.onnx\n\nVerify that the optimized model runs and produces the same results:\n\n.. code-block:: bash\n\n  tvmc run \\\n  --inputs imagenet_cat.npz \\\n  --output predictions.npz \\\n  resnet50-v2-7-tvm_autotuned.tar\n\n  python postprocess.py\n\nVerifying that the predictions are the same:\n\n.. code-block:: bash\n\n  # class='n02123045 tabby, tabby cat' with probability=0.610550\n  # class='n02123159 tiger cat' with probability=0.367181\n  # class='n02124075 Egyptian cat' with probability=0.019365\n  # class='n02129604 tiger, Panthera tigris' with probability=0.001273\n  # class='n04040759 radiator' with probability=0.000261\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Comparing the Tuned and Untuned Models\n--------------------------------------\n\nTVMC gives you tools for basic performance benchmarking between the models.\nYou can specify a number of repetitions and that TVMC report on the model run\ntime (independent of runtime startup). We can get a rough idea of how much\ntuning has improved the model performance. For example, on a test Intel i7\nsystem, we see that the tuned model runs 47% faster than the untuned model:\n\n.. code-block:: bash\n\n  tvmc run \\\n  --inputs imagenet_cat.npz \\\n  --output predictions.npz  \\\n  --print-time \\\n  --repeat 100 \\\n  resnet50-v2-7-tvm_autotuned.tar\n\n  # Execution time summary:\n  # mean (ms)   max (ms)    min (ms)    std (ms)\n  #     92.19     115.73       89.85        3.15\n\n  tvmc run \\\n  --inputs imagenet_cat.npz \\\n  --output predictions.npz  \\\n  --print-time \\\n  --repeat 100 \\\n  resnet50-v2-7-tvm.tar\n\n  # Execution time summary:\n  # mean (ms)   max (ms)    min (ms)    std (ms)\n  #    193.32     219.97      185.04        7.11\n\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Final Remarks\n-------------\n\nIn this tutorial, we presented TVMC, a command line driver for TVM. We\ndemonstrated how to compile, run, and tune a model. We also discussed the\nneed for pre and post-processing of inputs and outputs. After the tuning\nprocess, we demonstrated how to compare the performance of the unoptimized\nand optimize models.\n\nHere we presented a simple example using ResNet 50 V2 locally. However, TVMC\nsupports many more features including cross-compilation, remote execution and\nprofiling/benchmarking.\n\nTo see what other options are available, please have a look at ``tvmc\n--help``.\n\nIn the next tutorial, `Compiling and Optimizing a Model with the Python\nInterface <auto_tuning_with_pyton>`_, we will cover the same compilation\nand optimization steps using the Python interface.\n\n"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.9"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}